This report describes the testing of four Model Output
Statistics prediction methods on simulated data fields for
the purpose of determining their relative skills in
forecasting a generic weather parameter (predictand). Of the
four methods, three use Bayes' Law of Inverse Probability to
discriminate, while the other method uses conditional
probability. The simulated data sets, models and observers
necessary to accomplish this goal are created according to a
uniquely developed simulation design. The results indicate
that there is a definite difference in the ability of one of
the four methods, namely the method using conditional
probability, to forecast the weather parameter. Through the
use of the Analysis of Variance (ANOVA) technique, this
difference is found to be significant with respect to chance.

I. INTRODUCTION



The Model Output Statistics (MOS) approach to forecasting
consists of relating the numerical model output
parameters, diagnostic and prognostic (predictors), to
sensible operationally-important weather parameters
(predictands), e.g., visibility, cloud amount, precipitation,
for the purpose of enhancing the skill of forecasting these
parameters (Glahn and Lowry, 1972).

The first major MOS work was readied for application in
the early 1970's by the National Weather Service (NWS), the
weather arm of the National Oceanic and Atmospheric
Administration (NOAA). More recently, both the U.S. Navy and
the U.S. Air Force have been involved in the development of
MOS schemes. NOAA's programs are operational, both at
civilian and military sites, with continuing development by
the Techniques Development Laboratory (TDL), National
Weather Service, Silver Spring, MD.

The Navy's MOS effort began at the Naval Postgraduate
School (NPS) in Monterey, CA in the mid/late 1970's, using
visibility and fog over the North Pacific Ocean as the
predictands of interest (Renard and Thompson, 1984; Koziara,
Renard and Thompson, 1983). These experiments were limited
and the results not immediately operationally applicable.
However, the studies formed the basis for a decision by the
U.S. Navy to pursue the development of MOS for all oceans of
the world, for a select number of air/ocean parameters.
Following this decision in 1981, a series of studies was
initiated at NPS, as a joint effort of NPS's Meteorology
Department and the Naval Environmental Prediction Research
Facility (NEPRF), Monterey, CA. The first study (Karl, 1984)
investigated the use of three conditional probability MOS
prediction schemes developed by Preisendorfer¹ (1983a, b),
using visibility at the model initialization time as the
predictand and the output from Fleet Numerical Oceanography
Center's (FNOC) Navy Operational Global Atmospheric
Prediction System (NOGAPS) model as predictors, applied to a
limited homogeneous region of the North Atlantic Ocean. Karl
intercompared the results of these three methods and
multiple linear regression methods with variable
thresholding, as proposed by Lowe (1984). Diunizio (1984)
followed Karl with a similar experiment, but for additional
homogeneous Atlantic Ocean areas, with forecast intervals to
48 hours, and with modifications to the MOS methodologies.
The third study (Wooster, 1984) concentrated on cloud amount
and ceiling, using essentially the same MOS methods as his
predecessors with further variations in the multiple linear
regression threshold model. The most recent effort (Elias,
1985) contrasted a new model, namely the Principal
Discriminant Method (Preisendorfer, 1984), with the earlier
methods on their ability to predict visibility.

This study concerns the testing of three MOS prediction 
methods exercised by the previous NPS investigations (the 
Maximum-Probability Method II, the Multiple Linear 
Regression Method, and the Principal Discriminant Method), 
plus one additional method (Discriminant Analysis Method), 
on statistically-derived simulated (i.e., controlled) 
predictor/predictand data sets, with the goal of ranking the 
methodologies as to their relative skill in predictand 
specification. 



¹Dr. R. W. Preisendorfer was the Naval Air Systems
Command G. J. Haltiner Research Chair Professor in the
Department of Meteorology, Naval Postgraduate School,
Monterey, CA for 1983. Dr. Preisendorfer is currently
affiliated with NOAA's Pacific Marine Environmental
Laboratory (PMEL) in Seattle, WA.

II. OBJECTIVES AND APPROACH



There are three main objectives in this particular study
contributing to the U.S. Navy's MOS program. The first
objective is to develop fields of simulated data that have
controllable predictor/predictand parameters, such that the
conditions of predictability of the data fields can be
varied and the results made reproducible. The second
objective is to test four MOS prediction methods, using the
simulated data fields, in order to determine their relative
ranking in the skill of predicting the generic sensible
weather parameter. The third objective is to determine the
most skillful MOS method on which further testing could be
concentrated.

The approach taken to fulfill these objectives was to
program the simulation procedures outlined by Preisendorfer
(1985) so as to create the data fields. One training set and
three test sets were generated for each MOS methodology in
order to provide ample scoring statistics with which to
analyze the results. For this study, two such
training/testing sets of simulated data were generated with
1200 rows of nine columns, each column representing a real
primary data field.
One set was called 'the easy data set', while the other was 
called 'the hard data set'. The easy data set was one that 
the MOS prediction methods could easily make a prediction 
from, due to the relatively high correlation between the 
predictors and the predictand. The hard data set was one
that would have less correlation between the predictors and
the predictand, thus making a prediction more difficult. The
two data sets were determined in such a way as to have a
signal-to-noise ratio² of 4:1 for the easy data set and 1:1
for the hard data set. Two numerical weather prediction
models were also simulated in this study. The first is
called the good model because it introduced less distortion
and less noise to the signal, while the second, the bad
model, produced a greater distortion and a larger noise
component. Also in this study, three observers were
simulated in order to look at the resultant variation in
predictive skill of the four methods, as the skill of the
observer varied. The three observers were: i) the perfect
observer - one who never makes a wrong observation of the
actual weather event; ii) the good observer - one who
occasionally makes a wrong observation of the actual weather
event, but not by more than one category; and lastly iii)
the bad observer - one who makes a wrong observation of the
actual weather event more often than the good observer, and
sometimes by more than one category.

²The signal-to-noise ratio is defined as the maximum
ratio of the between-class distance to the within-class
distance of two (or more) multivariate populations, given by
the function:

$$ S/N = [\mu(1) - \mu(2)]^2 / \sigma^2 $$ (for the univariate situation),

where $\mu(1)$ is the mean of the first category of the three
being compared, $\mu(2)$ is the mean of the other category,
and $\sigma^2$ is the average variance of the two categories
being compared.



III. METHODOLOGIES OF THE MOS SIMULATED DATA SETS 



The sections that follow describe the procedures used to
create the simulated data sets, the simulated models and the
simulated observers for this study, as based on guidance
provided by Preisendorfer (1985).

A. THE MOS SIMULATION PROBLEM 

The problem of designing an MOS simulation process is
threefold in nature. Besides designing the simulated natural
data fields, there is also the problem of prescribing the
skill of the observer viewing them, and of defining how
accurately a numerical weather prediction model would
reproduce them. The observer, on the one hand, would be
viewing a parameter, such as visibility at sea, and have to
estimate it subjectively. On the other hand, the model will
be reproducing output parameters such as pressure,
temperature, winds, etc., that could be used by any of the
MOS prediction methods to predict the visibility, or other
sensible weather parameters.

In setting up an MOS scheme, i.e., to train it to forecast
the sensible weather parameter, the observations of the
predictand must be coupled with the output parameters of the
model. Then, when the method is to be tested, it will,
according to its training, take a fresh model predictor set
and produce a forecast of the predictand. Thus, in
attempting to design a procedure using the simulated data
sets, where there is built-in correlation between the real
predictors, a simulation of the model fields must be done as
well, including whatever biases and errors are inherent
to numerical weather prediction models. The ability of the
observer must also be modeled so that it does a controllably
imperfect job in estimating the weather parameter in
question. This idea is illustrated in Fig. 1.

The left-hand column represents the real atmosphere or
ocean as given to us by nature. The real primary fields are
those which are normally measured or are known in principle.
Specifically, the real primary fields are those which are
normally measured or observed and incorporated routinely
into the Newtonian equations of motion and the Laws of
Thermodynamics. The real secondary fields are those of
forecasting interest in this study, since the numerical
models do not usually forecast them directly. The right-hand
column represents the modelled atmosphere or ocean. The
model primary and secondary fields here are only as good as
the model that derives them and the human that observes
them, respectively. It is recognized that there is always
some level of error in these fields. The MOS methods in this
simulation study will take the output parameters
(predictors) from the modelled primary field, pair them with
the simulated observations of the secondary field, and use
these pairings subsequently to forecast the simulated real
secondary field parameter (predictand).

B. THE MOS PREDICTION CONCEPT 

Now for a more detailed look at the MOS procedure 
outlined above. A generalization of the MOS approach to 
prediction of sensible weather parameters is shown in Fig. 
1. Some estimated value of a real secondary field parameter 
(such as visibility) is recorded by an observer for some 
location at time t. This estimate is called the estimated 
predictand since the secondary field is the field that is to 
be predicted using statistics derived from these estimates. 
Meanwhile, at the same location and time t the numerical 
weather prediction model produces values for the model 
primary fields. The model primary fields are called the 
model predictors. 

These model predictor/estimated predictand pairings are
taken over a large region of interest and provide what shall
be termed the Basic Data Set (BDS). An MOS method (X, Y,
..., Z) is then trained on a part of the BDS to forecast a
value of the predictand field from the set of model
predictors it is given. This method is then tested on the
remaining part of the BDS as follows.

1) A set of model predictors is chosen from the testing 
part of the BDS, using the predictors obtained during 
the training stage. 

2) The method then produces a forecast of the predictand 
from this set. 

3) This forecast predictand is then compared with the 
estimated predictand that was paired with the set of 
model predictor values used. 

4) Then a skill score is assigned to the forecast of this 
particular method. 

5) This procedure is then repeated for the other MOS 
methods to be tested. 
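
Steps 3 and 4 above can be illustrated with a minimal sketch. The thesis does not specify the skill score at this point, so the fraction of correct forecasts taken from a contingency table is used here purely as a stand-in; the function names and the toy category sequences are hypothetical.

```python
import numpy as np

def contingency_table(forecast, observed, n_cat=3):
    """Count forecasts (rows) against estimated predictands (columns).

    Categories are assumed to be labelled 1..n_cat.
    """
    table = np.zeros((n_cat, n_cat), dtype=int)
    for f, o in zip(forecast, observed):
        table[f - 1, o - 1] += 1
    return table

def fraction_correct(table):
    """Stand-in skill score: share of cases on the main diagonal."""
    return np.trace(table) / table.sum()

# Toy forecast and estimated-predictand categories (illustrative only).
forecast = np.array([1, 2, 2, 3, 3, 1])
observed = np.array([1, 2, 3, 3, 2, 1])
table = contingency_table(forecast, observed)
print(table)
print(fraction_correct(table))
```

Repeating this for each method on the same test set gives scores that can be intercompared directly.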

Once all of the scores have been obtained, the various
methods are intercompared using these skill scores and some
assessment is made as to their relative abilities to
accurately predict the estimated predictand value from the
given set of model predictors. The important assumption made
here is that the methods will have the same relative ranking
using the simulated data as they would have when
subsequently tested with a fresh set of model predictor
values from an actual data set of model output. This
assumption will be well-founded provided that:

1) The simulated data set is representative (in a
statistical sense) of the real primary and secondary fields
encountered in nature; and

2) The errors of the models and observers have been
well simulated.

C. SIMULATING THE REAL PRIMARY FIELDS 

The real primary fields, which may be thought of as time
series at p fixed points in space, were generated from a set
of $p \times p$ covariance matrices using a formulation
scheme suggested by Preisendorfer (1985, Appendix C). The
field-to-field (i.e., cross-) correlations are represented
by the matrix M, while the time-lagged field-to-field
correlations are represented by the $p \times p$ matrix K.
The scheme was developed with the goal of saving on
computation time by defining the covariance matrices to be
associated with stationary processes in space. For example,
suppose m(x,x') is the entry of M in row x and column x';
then the matrices are stationary, in that the covariances
depend only on the differences (x-x') of the arguments x and
x' in m(x,x'), so that it may be written as m(x-x'), and
k(x-x') for K. In this context, the eigenvalues and
eigenvectors of M and K are expressible using simple
algebraic formulas. The need for various matrix manipulating
subroutines is thereby eliminated.

The formula used to create the $p \times p$ M matrix is:

$$ m(x,x') = \exp(-\mu \cdot |x-x'|) \tag{3.1} $$

for $x,x' = 0, \ldots, p-1$ and $0 < \mu < \infty$, where
$\mu$ is the variable that controls the field-to-field
correlation. The $p \times p$ K matrix is formed similarly:

$$ k(x,x') = \kappa_a \cdot \exp(-\mu \cdot |x-x'|) \tag{3.2} $$

for $x,x' = 0, \ldots, p-1$ and $0 < \kappa_a < 1$, where
$\kappa_a$ is the variable that controls how much of a
lag-induced difference there is between matrices M and K.
(This is a simplified version of a more general approach in
Preisendorfer (1985, Appendix B).) In this study, p was set
equal to 9, so that the time series used simulated nine
points in space.

For this study two separate data sets were formed. The
difference between them is the amount of correlation and the
amount of lag defined for the 9x9 matrices M and K. The
degree to which the set of predictors produced by the model
was correlated, and also the amount of lag effect between
the pair of matrices, determined whether the MOS prediction
method would have an easy or a hard time forecasting the
sensible weather parameter (predictand). Table I gives an
example of the covariance matrices M and K with $\mu$ = 0.15
and $\kappa_a$ = 0.90. These values provide a set of
covariance matrices that are well correlated and thus
labelled the 'easy' data set. The values for the variables
of the 'hard' data set are $\mu$ = 0.50 and $\kappa_a$ =
0.60. This produced a correlation that is much lower and a
lag-induced effect that is greater. Therefore, the forecast
made from this Basic Data Set is more difficult. The lower
half of Table I shows these matrices from the hard data set.

The range of values of the x,x' pairs is especially
tailored for the spatially stationary context, and in fact,
arithmetic modulo p must be used on the spatial-index x-x'
values that do not fall within the prescribed range. For
example, if x-x' is not in the set {x: 0, ..., p-1}, x-x'
must be reduced modulo p to map it into the finite set. The
diagram in Fig. 2 illustrates this idea for the case p = 9.
The range of x,x' can be visualized as being on a circle
on which the values 0, ..., p-1 (= 8) are plotted. Any
values of x-x' outside this range are wrapped around the
circle modulo p, so, in this way, all members of the set of
integers (each representing the spatial location of a time
series) can be handled.

By construction, the values of m(x,x') and k(x,x')
depend only on x-x', and by symmetry we have m(x,x') =
m(x',x) and k(x,x') = k(x',x). Hence, from now on m(x,x') =
m(0,x-x') can be written as m(x-x'), i.e., with only one
argument. In relating the diagram in Fig. 2 to the matrix
values in Table I, notice that m(8,7) = m(1), while m(7,8) =
m(-1) = m(8), and since, by symmetry, m(7,8) = m(8,7), it
can be seen that m(8) = m(1). The same holds true for matrix
K. Table II shows M and K in the simplified notation.
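
The construction of M and K on the circle can be sketched as follows. This is an illustration rather than the thesis code: it assumes that the wrap-around together with the symmetry m(8) = m(1) amounts to using the circular distance min(d, p - d) in Eqs. (3.1) and (3.2), and the function names are hypothetical.

```python
import numpy as np

def circular_lag(p: int) -> np.ndarray:
    """Circular distance min(d, p - d) between every pair of grid points."""
    idx = np.arange(p)
    d = np.abs(idx[:, None] - idx[None, :])   # |x - x'|
    return np.minimum(d, p - d)               # wrap around the circle

def covariance_matrices(mu: float, kappa_a: float, p: int = 9):
    """Return (M, K) following Eqs. (3.1) and (3.2) on a circle of p points."""
    lag = circular_lag(p)
    M = np.exp(-mu * lag)      # field-to-field correlations
    K = kappa_a * M            # time-lagged correlations
    return M, K

# Parameter values quoted in the text for the two data sets.
M_easy, K_easy = covariance_matrices(mu=0.15, kappa_a=0.90)
M_hard, K_hard = covariance_matrices(mu=0.50, kappa_a=0.60)
print(np.round(M_easy[0], 3))
```

The first row printed is m(0), ..., m(8); under the stated assumption its entries satisfy m(8) = m(1), m(7) = m(2), and so on.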
The data sets were generated using the following
autoregressive equation:

$$ Z(t,x) = 0.5 \cdot \nu(0) \cdot b(0,t) + \sum_{j=1}^{m} \nu(j) \cdot [\, b(j,t) \cdot \cos(\kappa(j) \cdot x) + c(j,t) \cdot \sin(\kappa(j) \cdot x) \,] \tag{3.3} $$

where $m = 4$ and $p = 2 \cdot m + 1$ for $j = 0, \ldots, m$,
with $t \in \mathbb{Z}$ (the set of integers);
$x = 0, \ldots, p-1$; $\nu(j) = (\lambda(j)/p)^{1/2}$;
$\kappa(j) = (2 \cdot \pi \cdot j)/p$.

The terms used in Eq. (3.3) will be elaborated on
individually or in pairs in the following paragraphs,
starting with the first equations derived from Eqs. (3.1)
and (3.2).

The $\lambda(j)$ term found in the expression for $\nu(j)$
is the set of eigenvalues determined for the M matrix by the
following formula:

$$ \lambda(j) = 1.0 + 2.0 \cdot \Big\{ \sum_{x=1}^{m} m(x) \cdot \cos[\kappa(j) \cdot x] \Big\} \tag{3.4} $$

with $\lambda(p-j) = \lambda(j)$ using modular arithmetic,
and $\kappa(j) = (2 \cdot \pi \cdot j)/p$, for
$j = 1, \ldots, m$. Through this variable the field-to-field
matrix correlations were expressed.
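
As a check on the eigenvalue formula, the sketch below evaluates the cosine sum for a symmetric circulant matrix and compares it with a general-purpose eigenvalue routine. It assumes the sum runs over x = 1, ..., m with p = 2m + 1 and that M is built from the circular distance; the function name is hypothetical.

```python
import numpy as np

def circulant_eigenvalues(first_row: np.ndarray) -> np.ndarray:
    """lambda(j) = c(0) + 2 * sum_{x=1..m} c(x) * cos(2*pi*j*x/p), j = 0..p-1."""
    p = len(first_row)
    m = (p - 1) // 2
    x = np.arange(1, m + 1)
    kappa = 2.0 * np.pi * np.arange(p) / p
    cos_terms = np.cos(np.outer(kappa, x))            # shape (p, m)
    return first_row[0] + 2.0 * (first_row[x][None, :] * cos_terms).sum(axis=1)

p, mu = 9, 0.15                                        # easy data set values
lag = np.minimum(np.arange(p), p - np.arange(p))       # circular distance
m_row = np.exp(-mu * lag)                              # m(0), ..., m(p-1)
lam = circulant_eigenvalues(m_row)

# Cross-check against the eigenvalues of the full circulant matrix.
M = np.array([[m_row[(k - i) % p] for k in range(p)] for i in range(p)])
print(np.allclose(np.sort(lam), np.sort(np.linalg.eigvalsh(M))))
```

The agreement is what allows the closed-form expression to replace matrix-manipulating subroutines.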

The autoregressive correlations $\rho(j)$ were given by the
formula:

$$ \rho(j) = \Big\{ k(0) + 2.0 \cdot \sum_{x=1}^{m} k(x) \cdot \cos[\kappa(j) \cdot x] \Big\} / \lambda(j) \tag{3.5} $$

for $j = 0, \ldots, m$; with $\rho(p-j) = \rho(j)$ for
$j = 1, \ldots, m$, again by modular arithmetic. This
relation takes into account the lag correlation and is used
to obtain the variances for the random forcing of Eq. (3.3).
These variances were obtained through use of the formulae:

$$ \sigma^2(\beta,j) = 2.0 \cdot (1 - \rho^2(j)) \cdot (1 + \delta(0,j)) \tag{3.6} $$

$$ \sigma^2(\gamma,j) = 2.0 \cdot (1 - \rho^2(j)) \cdot (1 - \delta(0,j)) \tag{3.7} $$

for $j = 0, \ldots, m$, where $\delta(0,j)$ is a special case
of Kronecker's delta function $\delta(i,j)$ with $i = 0$. In
general, $\delta(i,j) = 1$ if $i = j$, and $= 0$ if
$i \neq j$.

The square roots of the variances obtained in Eqs. (3.6)
and (3.7) (i.e., the standard deviations) are used in a
random number generator subroutine³ to get the random
forcing terms, such that their values are normally
distributed with a zero mean, i.e.:

$$ \beta(j,t) \sim N(0, \sigma^2(\beta,j)) \quad \text{and} \quad \gamma(j,t) \sim N(0, \sigma^2(\gamma,j)) \tag{3.8} $$

Observe that $\sigma^2(\beta,j) = \sigma^2(\gamma,j)$ for
$j = 1, \ldots, m$, and that
$\sigma^2(\beta,0) = 4 \cdot [1 - \rho^2(0)]$, while
$\sigma^2(\gamma,0) = 0$.

The autoregressive correlations $\rho(j)$ are used again
along with the random forcing terms to develop the
time-dependent coefficients b(j,t) and c(j,t) via the
following formulae:

$$ b(j,t) = \rho(j) \cdot b(j,t-1) + \beta(j,t) \tag{3.9} $$

$$ c(j,t) = \rho(j) \cdot c(j,t-1) + \gamma(j,t) \tag{3.10} $$

for $j = 0, \ldots, m$.⁴
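
The generation procedure of Eqs. (3.3) and (3.6) through (3.10) can be sketched as below. Several points are assumptions made for illustration: NumPy's generator replaces the GGUBS/Box-Muller combination, a burn-in period (not mentioned in the text) lets the coefficients reach their stationary spread, and Eq. (3.5) is read as the ratio of the eigenvalues of K and M, which for K = $\kappa_a$ M gives $\rho(j) = \kappa_a$ for every j. All function and variable names are hypothetical.

```python
import numpy as np

def simulate_primary_fields(lam, rho, n_steps, rng, burn_in=200):
    """Generate Z(t, x), x = 0..p-1, from Eqs. (3.3) and (3.6)-(3.10).

    lam : eigenvalues lambda(0..m) of M; rho : lag correlations rho(0..m).
    """
    lam = np.asarray(lam, dtype=float)
    rho = np.asarray(rho, dtype=float)
    m = len(lam) - 1
    p = 2 * m + 1
    delta0 = (np.arange(m + 1) == 0).astype(float)              # delta(0, j)
    sd_beta = np.sqrt(2.0 * (1.0 - rho**2) * (1.0 + delta0))    # Eq. (3.6)
    sd_gamma = np.sqrt(2.0 * (1.0 - rho**2) * (1.0 - delta0))   # Eq. (3.7)
    weights = np.sqrt(lam / p)                                  # nu(j)
    weights[0] *= 0.5                                           # 0.5 * nu(0)
    kappa = 2.0 * np.pi * np.arange(m + 1) / p
    x = np.arange(p)
    cos_kx = np.cos(np.outer(kappa, x))                         # (m+1, p)
    sin_kx = np.sin(np.outer(kappa, x))

    b = np.zeros(m + 1)
    c = np.zeros(m + 1)
    Z = np.empty((n_steps, p))
    for t in range(-burn_in, n_steps):
        b = rho * b + rng.normal(0.0, sd_beta)     # Eq. (3.9)
        c = rho * c + rng.normal(0.0, sd_gamma)    # Eq. (3.10); c[0] stays 0
        if t >= 0:
            Z[t] = (weights * b) @ cos_kx + (weights * c) @ sin_kx  # Eq. (3.3)
    return Z

# Easy data set parameters from the text.
p, m, mu, kappa_a = 9, 4, 0.15, 0.90
lag = np.arange(1, m + 1)
kap = 2.0 * np.pi * np.arange(m + 1) / p
lam = 1.0 + 2.0 * (np.exp(-mu * lag) * np.cos(np.outer(kap, lag))).sum(axis=1)
rho = np.full(m + 1, kappa_a)    # assumption: rho(j) = kappa_a for all j
Z = simulate_primary_fields(lam, rho, n_steps=1200, rng=np.random.default_rng(0))
print(Z.shape)
```

With stationary variances of 4 for b(0,t) and 2 for the remaining coefficients, the covariance implied by Eq. (3.3) between columns x and x' reduces to m(x-x'), which is the property the scheme is designed to deliver.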

³The random number generator used is NPS's W. R. Church
Computer Center library subroutine, GGUBS, which creates
variates with uniform distributions. These uniform
distributions are changed into gaussian normal distributions
through the use of the method of Box and Muller (1958).

⁴Notice that c(0,t) is uniquely zero due to the fact
that $\sigma^2(\gamma,0) = 0$.

D. SIMULATING THE REAL SECONDARY FIELD

The next step in the simulation process is the modelling
of the real secondary fields (predictands). This required
defining a link between the real primary and secondary
fields. The fact of the matter is that there are, in many
cases, no readily available algorithms which can be applied
to the real primary fields to obtain the real secondary
fields. Indeed, this is at present a vigorously pursued set
of problems in meteorology and oceanography. This fact is
the basis for the need of MOS-type procedures in the first
place! Fortunately, in this simulation study the real forms
of such linkages are not needed; in order to provide the
linkage desired, it is sufficient to invent some
reasonable-appearing relations. The predictors could
have been combined together in any number of different ways,
since the number of algebraic and analytic possibilities is
endless; for the present study the method chosen is the
following. There are in the set of real primary fields
generated by Eq. (3.3) nine columns of predictor time
series, Z(t,0) through Z(t,8), for t = 1, ..., 1200, where
1200 is the number of entries in each column. By taking one
of these columns and relabelling it the real secondary
field, the desired effect of having a linkage between the
predictors and the predictands is assured.

So now, by following this procedure, there is one
predictand column and eight predictor columns with a
correlation factor and a lag factor that can be controlled
by the use of the $\mu$ and the $\kappa_a$ terms. This
method of generating predictors and predictand is termed the
In-House Field Method; see Preisendorfer (1983; Appendix B).

For this study, the predictand column chosen is the
Z(t,0) column because of the symmetry of the values to
either side of it. The correlation values are highest for
the nearest two neighbors, Z(t,1) and Z(t,8), and decrease
in order going away from Z(t,0) in either direction. The
correlation of the eight predictor columns to the predictand
column is shown in Fig. 3 and Fig. 4 for the easy and hard
data sets, respectively.
Next, the predictand column is sorted by entry
magnitudes, with the entries of lowest magnitude being
placed at the top of the column, while those with the
largest magnitudes are placed at the bottom. At the same
time, the other eight columns of predictors are reordered
with the same row permutation. In this way the row
relationship between the predictand (Z(t,0) value) and the
predictors (Z(t,1) through Z(t,8)) is maintained. Once this
is accomplished, the values in the predictand column are
grouped in the following manner.

1) For the upper 400 entries, those with the lowest
numerical values are tagged with the value 1 and comprise
category 1 (analogous to forming the 'below' tercile in
meteorological practice).

2) For the middle 1700 entries, those between the two
extremes are tagged with the value 2 and comprise
category 2 (forming thereby the 'normal' tercile).

3) For the lowest 400 entries, those with the largest
numerical values are tagged with the value 3 and
comprise category 3 (forming the 'above' tercile).

The middle category, category 2, contains the most
entries because of a desire to keep the variances for the
three categories equal, or nearly so. This proved to be
quite a challenge, and only after many experiments with
various category sizes was it accomplished. Fig. 5 shows an
example of the three categories with equally populous
intervals (400 entries in each). The high narrow spike in
the middle category indicates that the variance for that
category was much less than that of either of the two
categories on the wings, and a small variation between the
training set and the test sets led to widely different
verification scores. This result was unsatisfactory, and so
a different interval size was sought that would not lead to
this type of instability. The approach that was settled on⁵
was to have the interval size determined by the standard
deviation. The overall interval spacing for a gaussian
(normal) distribution can be defined by ± three standard
deviations either side of the mean. Hence, for this study,
the middle two standard deviation (sigma) intervals are
defined as the middle category, category 2, while the two
outer sigma intervals on either side are defined as
category 1 and category 3, as shown in Fig. 6.

⁵This idea of interval spacing using the standard
deviation as a measure was suggested by Lowe in a private
discussion, and upon testing proved to have the stability
desired when there were at least 400 entries per predictor
column.

The final step for the predictand-category defining
process is to randomly sample down from the 1700 entries in
the middle category to obtain the desired 1200 entry size
for each column, which means that each category subset has
400 values, and also nearly the same variance.
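
The category definition and the down-sampling step can be sketched as follows. The sketch assumes that "the middle two sigma intervals" means values within one standard deviation of the mean, and it down-samples every category to the size of the smallest one (which leaves the outer categories untouched when they are already equal in size). The function names and the toy column are hypothetical.

```python
import numpy as np

def sigma_categories(z: np.ndarray) -> np.ndarray:
    """Tag each value 1, 2 or 3 using mean -/+ one standard deviation as cuts."""
    lo, hi = z.mean() - z.std(), z.mean() + z.std()
    return np.where(z < lo, 1, np.where(z > hi, 3, 2))

def balance_categories(cat: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return sorted row indices with each category sampled down to the smallest count."""
    n = min(int(np.sum(cat == k)) for k in (1, 2, 3))
    keep = [rng.choice(np.flatnonzero(cat == k), size=n, replace=False)
            for k in (1, 2, 3)]
    return np.sort(np.concatenate(keep))

z = np.arange(-10.0, 11.0)                 # toy predictand column, 21 values
cat = sigma_categories(z)
rows = balance_categories(cat, np.random.default_rng(0))
print([int(np.sum(cat == k)) for k in (1, 2, 3)])
print([int(np.sum(cat[rows] == k)) for k in (1, 2, 3)])
```

The same row indices would be applied to the eight predictor columns so that the predictor/predictand pairing is preserved.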

E. SIMULATING THE MODEL PRIMARY FIELDS (MODEL PREDICTORS) 

The counterparts to the real primary fields are the
model primary fields, as shown in Fig. 1. The model primary
fields are the imperfect versions of nature's real primary
fields since they contain some distortion and noise due to
the inability of man to model the atmosphere and ocean
accurately.

The model imperfection is simulated in this study
through the following equation:

$$ X(t,x) = \sum_{x'=1}^{p-1} \mathrm{St}(x,x') \cdot Z(t,x') + n(t,x). \tag{3.11} $$

Here the model primary field (X(t,x)) is produced first by
having the real primary field (Z(t,x)) distorted through
multiplication with a matrix St(x,x') that consists of a
fraction on the diagonal and zero elsewhere, and second by
having some noise added to it, element by element. The St
matrices used in this simulation are shown in Table III.
They are special cases of the more general linear
transformations possible on the Z(t,x) field.

The value on the diagonal for the good model is 0.95, and
for the bad model it is 0.50. The noise term n(t,x) is
created using the same random number generator as before,
only now different values are used for the variances to
control the amount of spread about the zero mean. The
normal distributions obtained are generated with
$\sigma^2 = 0.01$ for the good model, and with
$\sigma^2 = 0.25$ for the bad model. Thus, the good model
has 95% of the original Z(t,x) value, plus a random
perturbation from the centered normal distribution with
standard deviation $\sigma = 0.1$. The bad model has only
50% of the original Z(t,x) value, and a perturbation of
standard deviation $\sigma = 0.5$.
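
A minimal sketch of the model distortion, assuming a diagonal distortion matrix with a single constant on the diagonal and independent gaussian noise for every element, is given below. The function name and the toy field are hypothetical, and NumPy's generator stands in for the original random number subroutine.

```python
import numpy as np

def model_primary_fields(Z, diag_value, noise_variance, rng):
    """X = St @ Z + n, with St = diag_value * I and n ~ N(0, noise_variance)."""
    Z = np.asarray(Z, dtype=float)
    St = diag_value * np.eye(Z.shape[1])                 # distortion matrix
    noise = rng.normal(0.0, np.sqrt(noise_variance), size=Z.shape)
    return Z @ St.T + noise

Z_toy = np.arange(12.0).reshape(4, 3)                    # toy predictor columns
rng = np.random.default_rng(0)
X_good = model_primary_fields(Z_toy, 0.95, 0.01, rng)    # good model
X_bad = model_primary_fields(Z_toy, 0.50, 0.25, rng)     # bad model
print(np.round(X_good, 2))
```

Because the distortion matrix is diagonal, each model predictor column is simply a scaled, noisier copy of the corresponding real column.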

F. SIMULATING THE OBSERVED REAL SECONDARY FIELDS (ESTIMATED PREDICTAND)

The final simulation requires the creation of an 
observer who will make estimates of the real secondary field 
(estimated predictand). These observations with their 
inherent errors will be combined with the model predictors 
by the MOS prediction methods and be used to forecast the 
real secondary field parameter as shown in Fig. 1. 

For this study, the estimated predictand was created
with marine atmospheric visibility in mind as the sensible
weather parameter being forecast. So the estimates of the
predictand are grouped into discrete categories according to
whatever limits are desired to separate the categories. In
this study the generic predictand (considered as marine
atmospheric visibility) was grouped into three categories,
1, 2, 3, representing good, marginal and bad visibility,
respectively. In making an estimation of the predictand, the
observer may correctly choose the category of the actual
event, or the observer may miss and choose one of the other
two categories. For example, if there is only marginal
visibility occurring at some place, for some time t, then
the observer can choose category 2, and be correct, or
category 1 or 3, and be wrong. A good observer would have
the ability (or skill) to make the correct estimate most of
the time, but a bad observer would not. Table IV shows a 3x3
table which summarizes the relative frequency e(i,j) of the
observer estimating category i when, in fact, category j
occurs in nature. The e(i,j) values are normalized so that
each of the columns sums to 1. Here a perfect observer would
have e(i,j) = 0 when i ≠ j, and e(i,j) = 1 when i = j, for
i = 1, 2, 3. A bad observer would have greater off-diagonal
values, indicating an inability to observe correctly.

The observer's estimate is modeled in the following
manner. The real secondary field values (predictands),
described earlier in Section D, are arranged in ascending
order according to their numerical values, and relabelled 1,
2, 3, such that the upper 400 values equal 1, the middle 400
values equal 2, and the last 400 values equal 3. To simulate
the observer's role a uniformly distributed number u is
chosen on the interval I = [0,1], which has been partitioned
according to the observer's skill, i.e.:

$$ A(1,j) = \{ u : 0 < u < e(1,j) \} \tag{3.12} $$

$$ A(2,j) = \{ u : e(1,j) < u < e(1,j) + e(2,j) \} $$

$$ A(3,j) = \{ u : e(1,j) + e(2,j) < u < 1 \}, $$

so that I = A(1,j) + A(2,j) + A(3,j), j = 1, 2, 3. For
example, if the number u, chosen randomly, falls into
A(2,j), then the category assigned will be category 2 for
the estimated predictand when in fact category j
(j = 1, 2, 3) occurs.
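
The partition of Eq. (3.12) can be sketched with cumulative column sums. The relative-frequency matrix used below is an invented example, not the contents of Table V, and the function names are hypothetical; the skill is taken as the average of the main diagonal elements.

```python
import numpy as np

def estimate_category(u: float, e_col: np.ndarray) -> int:
    """Map u in [0, 1) to category 1, 2 or 3 via the partition A(1,j), A(2,j), A(3,j)."""
    cum = np.cumsum(e_col)                      # e(1,j), e(1,j)+e(2,j), 1
    idx = np.searchsorted(cum, u, side="right")
    return int(min(idx, len(e_col) - 1)) + 1    # guard against round-off at 1

def observe(true_cat, e, rng):
    """Simulate the observer's estimate for every true category j."""
    u = rng.random(len(true_cat))
    return np.array([estimate_category(ui, e[:, j - 1])
                     for ui, j in zip(u, true_cat)])

def observer_skill(e: np.ndarray) -> float:
    """Average of the main diagonal of the relative-frequency table."""
    return float(np.trace(e)) / e.shape[0]

# Hypothetical observer; each column sums to 1.
e_example = np.array([[0.8, 0.1, 0.0],
                      [0.2, 0.8, 0.2],
                      [0.0, 0.1, 0.8]])
true_cat = np.repeat([1, 2, 3], 400)            # 400 entries per category
estimated = observe(true_cat, e_example, np.random.default_rng(0))
print(observer_skill(e_example))
```

Passing the identity matrix as e reproduces the perfect observer, whose estimates always equal the true categories.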

The three observers simulated in this study have skills⁶
decreasing from 100% for the perfect observer, to 87% for
the good observer, and still lower to 69% for the bad
observer. Their respective skills are shown in Table V.

⁶By skills it is meant the ability to observe correctly
the actual weather event which is occurring at the time of
observation. This ability can be defined by the calculation
of the average of the main diagonal elements of the
verifying relative frequency (i.e., contingency) table.

IV. THE MOS PREDICTION METHODS AND THEIR RESULTS



A. METHOD DESCRIPTIONS 

1. Principal Discriminant Method

The Principal Discriminant Method (PDM) was proposed
by Preisendorfer (1984). This method is basically a
discriminant method in that the data are partitioned into
predictand classes and forecasts are made by forming
probabilities for the predictor as to whether it belongs to
one predictand class or the other. This particular method of
discrimination is distinguished by the fact that it fits a
gaussian probability density distribution (for example) to
each predictand category subset, using a principal component
analysis of the data points in the category subset. The
method's most novel feature is that, if the categorical
distributions are significantly non-gaussian, then a
successive, controlled splitting of the category subset is
performed in the local principal component coordinate frame
to obtain a better fitting of probability density functions.
With each test set, fresh values of predictors are used and
the associated probability density values for each category
are found. The forecast of the predictand is then made using
Bayes' Law of Inverse Probability.

The Principal Discriminant Method also contains a
methodology for predictor selection. Although each of the
four methods to be tested has its own method of predictor
selection, the PDM was chosen to select the three predictors
to be used by all of the methods. The basic groundwork for
the PDM and its variable-selection process was programmed by
Elias (1985) in three separate programs (a single predictor
screening program, a predictor correlation program and a
multiple predictor program). Hale⁷ subsequently made
additional modifications to these programs to follow
Preisendorfer's original formulation more closely.

The procedure used to select variables and implement
the PDM is now as follows.



1) The initial screening program is run, in which each
predictor is separately ranked according to potential
predictability (PP) and potential total percentage
correct (PA0) for the training set. The first predictor
chosen is that with the highest PP.

2) Then the correlation program is run to find the next
candidate predictor to be added such that it is the
least correlated to those already chosen.

3) The PDM multiple predictor program, which implements the
methodology described above, is then run on the
predictors chosen through step 2. If the PP and PA0 scores
both increase, then the candidate predictor chosen in
step 2 is added. If the PP and PA0 do not increase, then
step 2 is repeated. The process is carried out until
three predictors are chosen.

4) With the three chosen predictors, the PDM multiple
predictor program is run using the test sets. The
program compiles a contingency table and computes
verifying statistics.
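The selection loop of steps 1 to 3 can be sketched as follows. This is an illustration only, not the Elias or Hale programs: the `score` callable is a hypothetical stand-in for a PDM run that returns the two training-set scores, "least correlated" is taken here to mean the smallest maximum absolute correlation with the predictors already chosen, and a rejected candidate is assumed not to be reconsidered. All data values are invented.

```python
from itertools import combinations

def select_predictors(candidates, corr, score, n_select=3):
    """Greedy predictor selection following steps 1-3.

    candidates : list of predictor names
    corr       : dict mapping frozenset({a, b}) -> correlation of a and b
    score      : callable(list of names) -> (pp, pa0) on the training set
    """
    # Step 1: start with the single predictor having the highest PP.
    chosen = [max(candidates, key=lambda p: score([p])[0])]
    best_pp, best_pa0 = score(chosen)
    remaining = [p for p in candidates if p not in chosen]

    while len(chosen) < n_select and remaining:
        # Step 2: candidate least correlated with those already chosen.
        cand = min(
            remaining,
            key=lambda p: max(abs(corr[frozenset((p, c))]) for c in chosen),
        )
        remaining.remove(cand)
        # Step 3: keep the candidate only if both scores increase.
        pp, pa0 = score(chosen + [cand])
        if pp > best_pp and pa0 > best_pa0:
            chosen.append(cand)
            best_pp, best_pa0 = pp, pa0
    return chosen

# Toy inputs, invented for illustration only.
merit = {"x1": 0.30, "x2": 0.28, "x3": 0.10, "x4": 0.20}
corr = {
    frozenset(("x1", "x2")): 0.9,
    frozenset(("x1", "x3")): 0.1,
    frozenset(("x1", "x4")): 0.4,
    frozenset(("x2", "x3")): 0.2,
    frozenset(("x2", "x4")): 0.5,
    frozenset(("x3", "x4")): 0.3,
}

def toy_score(subset):
    # Stand-in for a PDM run: reward individual merit, penalize redundancy.
    pp = sum(merit[p] for p in subset)
    pp -= 0.2 * sum(abs(corr[frozenset(pair)]) for pair in combinations(subset, 2))
    return pp, 0.5 + 0.5 * pp

selected = select_predictors(list(merit), corr, toy_score)
print(selected)
```

In this toy case the highly rated but redundant predictor x2 is never tried, because the less correlated candidates x3 and x4 are examined first and both raise the scores.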

2. Maximum-Probability Method II

The Maximum-Probability Method II (MaxProb2) was
also proposed by Preisendorfer (1983a, b) and exercised in
all of the previous MOS studies. It differs from the
discriminant methods in that the forecasts are based on
conditional probabilities for given values of the predictors.
To accomplish this, the data are first classed into an
n-dimensional predictor space. The discrete cells into which
the points are placed are of a size determined by dividing
each predictor into equally populous intervals. Then within
each cell the number of points belonging to each predictand
class is tallied and the conditional probabilities for each
predictand class are formed. Thus, each specific cell has
its own conditional probabilities, and the predictand
category which has the maximum conditional probability is
forecast for that cell. With each test set, fresh values of
the predictors are used to identify the cell location in the
predictor space and the forecast predictand category is
returned. A contingency table of forecast versus observed
values is then formed, from which the verifying statistics
can be calculated.

^7 Robert A. Hale, NPS, joined the NPS/NEPRF MOS project
team in February 1985. He has principal responsibility for
the management of the MOS-archived data fields and Fortran
programs.

The procedure used in this study employed two
Fortran programs written by Karl (1984).

1) First, a program is run which determines the number of
equally populous intervals into which each predictor
should be divided.

2) Second, a program is run which implements the
Maximum-Probability methodology for multiple predictors.
In this study, the three predictors were those chosen
by the Principal Discriminant Method.
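A minimal sketch of the cell-tally idea is given below. It is not Karl's Fortran code: the function names, the handling of cells that are empty in the training set (the `default` argument) and the toy data are all assumptions made for illustration, and the same number of intervals is used for every predictor.

```python
from bisect import bisect_right
from collections import Counter, defaultdict

def equal_population_edges(values, n_intervals):
    """Interior cut points splitting `values` into equally populous intervals."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_intervals] for k in range(1, n_intervals)]

def cell_of(x, edges):
    """Cell index tuple of predictor vector x; `edges` holds one cut list per predictor."""
    return tuple(bisect_right(e, xi) for e, xi in zip(edges, x))

def train_maxprob(X, y, n_intervals):
    edges = [equal_population_edges(col, n_intervals) for col in zip(*X)]
    tallies = defaultdict(Counter)
    for x, cat in zip(X, y):
        tallies[cell_of(x, edges)][cat] += 1
    # Forecast for each cell = category with the largest conditional frequency.
    forecast = {cell: counts.most_common(1)[0][0] for cell, counts in tallies.items()}
    return edges, forecast

def predict_maxprob(x, edges, forecast, default=None):
    # A cell that was empty in the training set has no tally; `default`
    # is a hypothetical fallback, not part of the method as described.
    return forecast.get(cell_of(x, edges), default)

# Toy training set: two predictors, eight points, three categories.
X = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 60), (7, 70), (8, 80)]
y = [1, 1, 1, 1, 2, 2, 2, 3]
edges, forecast = train_maxprob(X, y, n_intervals=2)
print(predict_maxprob((2.5, 25), edges, forecast))
print(predict_maxprob((7.5, 75), edges, forecast))
```

Because each cell simply memorizes the majority category of the training points that fall in it, increasing the number of intervals improves the training fit while leaving more cells sparsely populated or empty for fresh data.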

3. Discrimination by Dimension Reduction using Regression

The Discrimination by Dimension Reduction using
Regression (DDRR) method was proposed by Lowe^8. This method,
in combination with various thresholding techniques, has
been used in all previous NPS MOS studies since Karl (1984),
where it was called the Multiple Linear Regression method.
The method uses the BMDP Statistical Software [University of
California, 1983] programs P1R, P5D and P4F. The P1R
program carries out a multiple linear regression on the
input predictor distributions. The P5D program displays
various statistics (mean and variance of the classification
functions) and histograms for the dimensionally-reduced
estimated variable, Y, produced by the P1R program. The P4F
program produces a multiway frequency display which uses the
P5D output statistics with the predictor values of the test
sets to form a contingency table from which the verification
scores and other statistics of interest can be obtained. The
procedure followed for this method is outlined below.

^8 The method was described in a private conversation and
developed for this study based on a set of example programs
provided by Lowe.



1) The P1R program performs a dimension reduction using
linear regression. The regression equation which is
derived from the predictor values of the training set
by the P1R program is:

\[
  Y = C(0) + C(1)\,X(1) + C(2)\,X(2) + C(3)\,X(3), \tag{4.1}
\]

where C(0) is the intercept and C(1), C(2) and C(3) are
the regression coefficients. This linear least squares
fit produces a new variate, Y, which combines the
information of the three predictors in a single
dimension. For each test set, the respective values of
the X(1), X(2) and X(3) predictors yield a fresh value
of Y.

2) The dimensionally-reduced Y variate is not a
probability, but is used as an index or proxy in the
discriminant procedures to be implemented next. The Y
variate is grouped by the predictand categories and the
P5D program is used to obtain the mean (\mu) and variance
(\sigma^2) statistics of the classification functions.
Gaussian probability density functions are fit to these
newly formed groups via the following equation:

\[
  L(m,n) = \exp\left\{ -0.5 \left( \frac{[Y - \mu(m)]^2}{\sigma^2(m)}
           - \frac{[Y - \mu(n)]^2}{\sigma^2(n)} \right) \right\}, \tag{4.2}
\]

where m, n = 1, 2, 3 (m \neq n), forming six discriminant
functions.

3) Then Bayes Law of Inverse Probability is used as a
transform statement in the P4F program, using these six
discriminant functions to discriminate the category of
the predictand by choosing the one with the maximum
probability value in Eqs. (4.3) through (4.5). These
probabilities are given by:

\[
  P(1) = \frac{1}{1 + L(2,1) + L(3,1)} \tag{4.3}
\]

\[
  P(2) = \frac{1}{1 + L(1,2) + L(3,2)} \tag{4.4}
\]

\[
  P(3) = \frac{1}{1 + L(1,3) + L(2,3)} \tag{4.5}
\]
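Steps 2 and 3 can be illustrated with a short sketch. It is not the BMDP transform statement: it reads Eq. (4.2) as the usual Gaussian exponent, (Y - mu)^2 / sigma^2, and the category means and variances are hypothetical numbers. Note that Eq. (4.2) keeps only the exponential part of each fitted density; this equals the full density ratio when the category variances are equal, and Eqs. (4.3) through (4.5) carry no prior-probability terms, so equal priors are implied.

```python
import math

def ddrr_probabilities(y, mean, var):
    """Category probabilities of Eqs. (4.3)-(4.5) for one value of the Y variate.

    mean, var : dicts mapping category (1, 2, 3) to the mean and variance
                of Y for that category in the training set.
    """
    def L(m, n):
        # Eq. (4.2): ratio of the exponential parts of the two Gaussian fits.
        zm = (y - mean[m]) ** 2 / var[m]
        zn = (y - mean[n]) ** 2 / var[n]
        return math.exp(-0.5 * (zm - zn))

    cats = sorted(mean)
    return {k: 1.0 / (1.0 + sum(L(m, k) for m in cats if m != k)) for k in cats}

# Hypothetical training statistics, for illustration only.
mean = {1: -1.0, 2: 0.0, 3: 1.0}
var = {1: 0.25, 2: 0.25, 3: 0.25}

probs = ddrr_probabilities(0.8, mean, var)
forecast = max(probs, key=probs.get)  # category with the maximum probability
print(forecast, {k: round(p, 3) for k, p in probs.items()})
```

Written this way the three probabilities always sum to one, which provides a simple check on any transcription of the transform statement.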



4. Discriminant Analysis Method

The Discriminant Analysis Method (DISC) was also
proposed by Lowe^9 and uses Fisher's classical discriminant
analysis design. The procedure consists of using the BMDP
Statistical Software [University of California, 1983]
programs P7M and P4F. The P7M program performs a stepwise
discriminant analysis of the input predictor distributions,
while the P4F program is used, as with the DDRR method, to
produce the multiway frequency table information. The
procedure employed is as follows.

^9 The DISC method was obtained in the same fashion as the
DDRR method.



1) Run the P7M program on the training set for the three
predictors chosen by the PDM and obtain the
classification functions C(i,j), where i is the index of
the predictand category and j is the index of the
predictor variable (j = 1, 2, 3) or the intercept (j = 4).

2) Calculate from the set of classification functions the
set of six discriminant functions needed to perform the
discrimination by taking the differences between the
elements to form the coefficients as seen in Eq. (4.6).
Enter these values in the transform statement of the
P4F program.

\[
  L(m,n) = \exp\{ [C(m,1)-C(n,1)]\,X(1) + [C(m,2)-C(n,2)]\,X(2)
           + [C(m,3)-C(n,3)]\,X(3) + [C(m,4)-C(n,4)] \}, \tag{4.6}
\]

where X(1), X(2) and X(3) are the variables to be
filled by the three predictors of the test sets and
m, n = 1, 2, 3.

3) Then these discriminant functions are used in Bayes Law
of Inverse Probability to calculate the probabilities
P(1), P(2) and P(3) of the predictor set belonging to
categories 1, 2 and 3, respectively. Eqs. (4.3) through
(4.5) are used here also to find the probabilities.

4) The P4F program, using the discriminant functions and
the test set predictors, compiles a contingency table of
the forecast versus observed categories of the
predictand from which the verifying scores are calculated.
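Steps 2 and 3 amount to the following computation, shown here as a sketch rather than as the actual P4F transform statement. The coefficient values are hypothetical stand-ins for P7M output, and the fourth coefficient of each category is taken to be the intercept, as in Eq. (4.6).

```python
import math

def disc_probabilities(x, C):
    """Category probabilities from linear classification functions.

    x : the three predictor values X(1), X(2), X(3)
    C : dict mapping category m to [C(m,1), C(m,2), C(m,3), C(m,4)],
        where C(m,4) is the intercept.
    """
    def L(m, n):
        # Eq. (4.6): exponential of the difference of two classification functions.
        d = sum((C[m][j] - C[n][j]) * x[j] for j in range(3))
        return math.exp(d + (C[m][3] - C[n][3]))

    cats = sorted(C)
    # Eqs. (4.3)-(4.5): P(k) = 1 / (1 + sum of L(m, k) over the other categories).
    return {k: 1.0 / (1.0 + sum(L(m, k) for m in cats if m != k)) for k in cats}

# Hypothetical coefficients, for illustration only.
C = {
    1: [1.0, 0.5, -0.2, -1.0],
    2: [0.2, 0.1, 0.0, 0.0],
    3: [-0.8, 0.3, 0.4, -1.5],
}

probs = disc_probabilities([1.5, 0.0, -1.0], C)
forecast = max(probs, key=probs.get)
print(forecast, {k: round(p, 3) for k, p in probs.items()})
```

The forecast category is the one whose classification function takes the largest value, since each probability is that function's exponential divided by the sum of all three.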



B. RESULTS 

The successful simulation of the data, models and
observers for this study is apparent from the verification
scores^10 calculated. The perfect observer achieved the best
scores for all the data sets (highest values for the A0 and
TS1 scores, lowest values for the A1 score), while the bad
observer had the worst scores; the good model did better
than the bad model in all scores calculated; and the easy
data outscored the hard data by a wide margin. These results
are tabulated in Tables VI - VIII.

The PDM, DDRR and DISC methods have remarkably similar
values for the A0, A1 and TS1 verification scores. However,
the MaxProb2 method has markedly different values. On the
one hand, it scored better than the other three methods in
the training set (higher values for A0 and TS1, lower values
for A1). This is probably due to the discrete way MaxProb2
pairs the predictand categories to the predictor intervals.
On the other hand, it scored worse in the test sets. This
gap between the training set scores and the test set scores
occurs only with the MaxProb2 method. It appears that the
MaxProb2 method will provide an extremely good fit to the
training set data, if given enough intervals. However, if
the least-squares fit of the data of the test set does not
match that of the training set, then the verification scores
show minimal skill. The other three methods have smaller
differences between the training set and their test sets.

^10 See Appendix A for verification score definitions and
comments.

V. ANALYSIS OF VARIANCE (ANOVA) INTERCOMPARISONS OF THE MOS
PREDICTION METHODS

A. INTRODUCTION TO ANOVA 

The variations of one or more factors (prediction
method, data set, model, observer) involved in this MOS
study can be analyzed effectively by the technique of
analysis of variance (ANOVA). This technique allows the
variance of the measured variable (in this case, skill
score) to be broken down into the portions caused by several
factors and by interactions of those factors, whether varied
singly or in combination, and a portion attributed to
experimental error. ANOVA consists of:

1) a partitioning of the total sum of the squares of
deviations of the skill score from the mean into two or
more component sums of squares, each of which is
associated with a particular factor or with experimental
error, and

2) a parallel partitioning of the total number of degrees
of freedom.

When certain variations of a factor are singled out for 
study because they are considered to be of more importance 
or interest, then ANOVA can be used as a comparison of the 
mean effects of those certain variations. Statistical tests 
(F tests) are made to determine whether the observed differ- 
ences are probably real. If the differences are judged real, 
the main effects and interactions of the population may be 
estimated quite easily. 

The ANOVA table utilized is described below.^11

1) The first column lists the sources of variation and
indicates which of the sources are being varied. For
example, source ABC has three sources of variation with
source D being held fixed.

2) The second column lists the sums of the squares from
all sources listed in column one.

3) The third column lists the number of degrees of freedom
associated with the sources in column one. For a single
factor it is calculated by subtracting one from the
number of items being varied; for an interaction it is
the product of the values for the factors involved.

4) The fourth column lists the mean square, which is
calculated by dividing the sum of squares by the degrees
of freedom.

5) The fifth column lists the F test value observed for
the sources in column one.

6) The sixth column lists the F test critical value, which
the value in the fifth column must exceed if
significance is to be shown.

7) The seventh column lists the P value, which is the
probability of obtaining an F value at least as large as
that observed if the source had no real effect. Thus, a
very low value indicates that the observed variation is
unlikely to be due to chance.

^11 [...] of the ANOVA printouts, see [...]
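The degrees-of-freedom column and the observed F values can be reproduced with a short sketch. The factor levels are those listed beneath the ANOVA tables (4 methods, 2 data sets, 2 model versions, 3 observer types); the figure of three scores per combination of levels is an assumption inferred from the total of 143 degrees of freedom in Table IX, and the function names are illustrative only.

```python
from itertools import combinations
from math import prod

levels = {"A": 4, "B": 2, "C": 2, "D": 3}  # methods, data sets, models, observers
replicates = 3  # assumed: inferred from the 143 total degrees of freedom

def degrees_of_freedom(levels, replicates):
    df = {}
    names = sorted(levels)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            # Main effect: levels - 1; interaction: product over its factors.
            df["".join(combo)] = prod(levels[f] - 1 for f in combo)
    cells = prod(levels.values())
    df["ERROR"] = cells * (replicates - 1)
    df["TOTAL"] = cells * replicates - 1
    return df

def f_observed(sumsq, source_df, error_sumsq, error_df):
    """Mean square of a source divided by the mean square error."""
    return (sumsq / source_df) / (error_sumsq / error_df)

df = degrees_of_freedom(levels, replicates)
print(df["A"], df["AD"], df["ABCD"], df["ERROR"], df["TOTAL"])

# Source A of Table IX, using the rounded sums of squares printed there.
f_a = f_observed(0.1451, df["A"], 0.0201, df["ERROR"])
print(round(f_a, 1))
```

Because the tabulated sums of squares are rounded to four decimals, an F value recomputed in this way agrees with the printed value only approximately.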



B. THE INTERCOMPARISONS OF ALL FOUR METHODS USING ANOVA 

Tables IX - XI contain the intercomparisons of the four
MOS prediction methods using ANOVA, for the A0, A1 and TS1
scores, respectively. In these tables it can be seen that
the variability source A (prediction method) is significant
as compared to chance, since the Fobs value is greater than
the Fcrit value. Thus, there is a real difference between
the methods that is more than just random. The large Fobs
values for variance sources B, C and D (data, model and
observer, respectively) are due to the differences simulated
in their generation. Of interest is the fact that the
variation due to the differences in the data sets is nearly
twice that due to the differences in the model versions or
the observer types. The small negative values in the tables
are due to the computer not properly handling very small
values.



C. THE INTERCOMPARISONS OF THREE METHODS, MAXPROB2 REMOVED

The large Fobs term in the preceding tables for source A
was due to differences between the methods. As seen in
Chapter IV, Section B, the only method that appears to have
markedly different values is the MaxProb2 method. The
results of an ANOVA comparison with MaxProb2 removed can be
seen in Tables XII - XIV, for the A0, A1 and TS1 scores. The
Fobs term for source A in each is now less than the Fcrit
term, indicating that there is no significant difference
among the three discriminant methods. The data variation is
still nearly twice as large as the other two simulated
variations.



D. THE INTERCOMPARISONS OF THREE METHODS, 2X2 

Finally, a third set of ANOVA tables, this time a 2x2
comparison (two methods at a time) of the three remaining
MOS prediction methods for each of the verification scores
A0, A1 and TS1, can be seen in Tables XV - XXIII. This last
comparison shows that there is no significant difference
among the PDM, DDRR and DISC methods, for any of the
verification scores, regardless of how the methods are
paired.
VI. CONCLUSIONS AND RECOMMENDATIONS



A. CONCLUSIONS 

First, the DDRR, DISC and PDM all have similar skill for 
all variations of model, observer and data when tested on 
the simulated data. 

Second, the MaxProb2 method is not on the same
performance level as the three other methods tested, and
should be modified in future test studies according to the
recommendation by Preisendorfer (see the final
recommendation below).

Third, further co-evaluation of the three remaining MOS 
prediction methods should be continued with a goal of iden- 
tifying statistically significant MOS predictive schemes for 
specific forecast problems. 

Finally, the results from the ANOVA intercomparisons
indicate that the variation due to the difference in the two
data sets is the most important factor to be considered when
seeking skillful MOS prediction schemes. Therefore, the data
collected must be carefully analyzed by objective methods
(so as to identify those predictors whose values possess a
greater separability when grouped according to predictand
categories) in order for the resulting MOS forecast to be of
potentially high operational worth.



B. RECOMMENDATIONS 

First, the predictive skill of the MOS methods, when
predictand categories have unequal frequencies, needs to be
addressed. For example, the rare event (one which occurs
very infrequently, i.e., less than 10% of the time) is of
particular interest.

Second, the predictive skill of the MOS methods, when
the predictor's class conditional populations (as determined
by the predictand categories) have significantly unequal
variances, also needs to be addressed.

Third, different types of parametric probability distri- 
butions, other than the normal distribution used to generate 
the data fields, need to be developed so as to ascertain the 
relative skill of the MOS predictive methods when confronted 
with significantly non-normal data. It may be important in 
the future to relax the spatially stationary feature in the 
simulated data equations in order to achieve this recommen- 
dation. 

Fourth, predictand formulation methods other than the
predominantly linear In-House Field method used in this
study need to be investigated. The scientific literature
contains many specific non-linear algorithms that connect
real primary and real secondary fields (e.g., vapor
pressure, humidity, wind, etc.); these can be profitably
used in future simulation studies.

Fifth, provide a stronger foundation for future MOS 
studies by developing new techniques for the screening of 
predictors and predictor selection. 

Sixth, the stochastic skill of the MOS prediction
methods needs to be examined. In this study, a categorical
forecasting procedure was used where the skill statistics
are computed from a contingency table. However, an
alternate forecasting procedure would use the actual
predicted probabilities of belonging to a given predictand
category. With such a probabilistic method, the stochastic
skill is defined by the sharpness of the forecast
probabilities. For example, consider a two-category problem
where one method predicts a 90% probability of belonging to
category one and 10% to category two. Then this method is
said to be sharper and of higher predictive skill than a
method which forecasts corresponding probabilities of 55%
and 45% for the same problem.

Finally, the Maximum-Probability Method II strategy of
choosing the category with maximum probability may not
always be optimal. An alternate scheme would be to choose a
category randomly, using its computed probability as a guide
for the choice by a random number generator. Such a tactic
would likely produce higher skill scores, overall, i.e., on
average, than the MaxProb2 strategy. The basis for this can
be demonstrated mathematically, using the Brier (1950) skill
score, when the predictions are in terms of the probability
of a category. However, it may also be described
intuitively. Suppose, e.g., that the three probabilities of
the predictand are 0.2, 0.5 and 0.3 for bad, marginal and
good visibility for a certain realized predictor. The
MaxProb2 strategy always directs the selection of the
marginal category. If this strategy is followed many times,
then the marginal category will be picked 100% of the time,
and the bad or good categories will never be picked. But the
latter two occur 50% of the time, collectively. On the other
hand, randomly choosing categories will allow the bad and
good categories, collectively, to be chosen 50% of the time.
Clearly, by including the bad and good categories in this
way, a higher skill score would result.^12 This new strategy
should be implemented henceforth in all further studies of
the modified MaxProb2 method.

^12 This new strategy and its justification were proposed
by Preisendorfer in a private conversation. Its implications
have not yet been verified through experimentation.
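The frequencies quoted in this example can be checked with a few lines of arithmetic. The sketch assumes that the realized category occurs with the stated probabilities independently of the forecast, and it evaluates only the expected fraction of correct categorical forecasts (the quantity measured by A0); it says nothing about the Brier score or other probabilistic measures.

```python
probs = [0.2, 0.5, 0.3]  # bad, marginal, good visibility

# Maximum-probability rule: always forecast the most likely category, so
# the expected fraction correct equals the largest probability.
hit_rate_max = max(probs)

# Randomized rule: forecast category k with probability probs[k]; the
# forecast is correct when the independently occurring event is also k.
hit_rate_random = sum(p * p for p in probs)

# Share of forecasts falling in the two outer (bad and good) categories.
outer_share_max = 0.0
outer_share_random = probs[0] + probs[2]

print(hit_rate_max, round(hit_rate_random, 2))
print(outer_share_max, outer_share_random)
```

Under these assumptions the randomized rule does reproduce the 50% collective frequency of the bad and good categories, but its expected fraction correct is 0.38 against 0.50 for the maximum-probability rule. Any benefit of the new strategy would therefore need to be confirmed with a measure other than A0 when it is tested.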
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APPENDIX A 



STATISTICAL DEFINITIONS 



1. VERIFICATION SCORES 

See the table below for the following verification
score definitions. Total = R + S + T + U + V + W + X + Y + Z.

A) A0 - 'A naught' score - describes the probability of
making a correct forecast given the total sample of
observed events (also known as the Total Percent
Correct score).

A0 = (R + V + Z) / Total

B) A1 - 'A one' score - describes the probability of a one
category error, which is made when a forecast is one
category away from what was actually observed, e.g.,
category 2 forecast and either category 1 or 3
verified.

A1 = (S + U + W + Y) / Total

C) TS1 - Threat score - describes the reduction of
threat of being surprised by a category 1 event. In
terms of set theory, it is the intersection of the
category 1 forecast and observed values divided by the
union of the observed and forecast category 1 values.

TS1 = R / (R + S + T + U + X)



                  Sample Contingency Table

                          OBSERVED

                     1      2      3

               1     R      S      T     F1
  FORECAST     2     U      V      W     F2
               3     X      Y      Z     F3

                     O1     O2     O3    TOTAL
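The three definitions translate directly into code. The sketch below assumes the layout of the sample contingency table (rows are forecast categories, columns are observed categories); the counts are hypothetical.

```python
def verification_scores(table):
    """A0, A1 and TS1 from a 3 x 3 contingency table.

    table[i][j] = number of cases with forecast category i+1 and
    observed category j+1, i.e. [[R, S, T], [U, V, W], [X, Y, Z]].
    """
    (R, S, T), (U, V, W), (X, Y, Z) = table
    total = R + S + T + U + V + W + X + Y + Z
    a0 = (R + V + Z) / total        # correct forecasts
    a1 = (S + U + W + Y) / total    # one-category errors
    ts1 = R / (R + S + T + U + X)   # category 1 hits / (forecast or observed 1)
    return a0, a1, ts1

# Hypothetical counts, for illustration only.
table = [
    [30, 10, 2],
    [8, 60, 9],
    [1, 12, 28],
]

a0, a1, ts1 = verification_scores(table)
print(round(a0, 4), round(a1, 4), round(ts1, 4))
```

The remaining fraction, 1 - A0 - A1, is the share of two-category errors (cells T and X).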
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2. PREDICTAND CATEGORIES 



The equal variance definition of the predictand
categories implies that the single-trial probability of
success of a random forecaster (the stochaster) who is
forecasting these categories is not the same as that for
the equally populous definition of the predictand
categories. The expected value of A0 by the stochaster for
the presently constructed equal-variance categories is
0.5136, and the 5% upper critical value is 0.5956. The
expected value of A1 for the equal-variance categories is
0.4352, and the 5% lower critical value is 0.3532.
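The two expected values can be reproduced under one assumption about the category frequencies: that the equal-variance categories contain 16%, 68% and 16% of the cases (consistent with 1700 rows in category 2, cf. Fig. B.6, if the set holds 2500 rows; the total of 2500 is an assumption made here), and that the stochaster draws forecasts with these same frequencies independently of the observations.

```python
# Assumed category probabilities for the equal-variance definition.
p = [0.16, 0.68, 0.16]

# Forecast and observation are independent draws from p.
# A0: both fall in the same category.
expected_a0 = sum(pi * pi for pi in p)

# A1: the two categories differ by exactly one.
expected_a1 = sum(
    p[i] * p[j] for i in range(3) for j in range(3) if abs(i - j) == 1
)

print(round(expected_a0, 4), round(expected_a1, 4))
```

With these probabilities the sketch returns 0.5136 and 0.4352, the values quoted above. For equally populous categories the corresponding expected A0 would be 1/3, which is why the two category definitions give the stochaster different chances of success.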

These numbers are essential for an understanding of
the verification scores in Tables VI (for A0) and VII (for
A1). In particular, they tell us whether or not the
observer, model and data sets have been simulated in a
reasonable manner. For example, one would expect that the
perfect observer, working with a good model and easy data,
will obtain significantly high A0 scores and significantly
low A1 scores for just about any reasonably competitive
prediction scheme. This is borne out on perusal of Tables VI
and VII. For instance, the perfect observer, using a good
model and easy data, yields, for the PDM method in test set
1, an A0 score of 0.801, far above the 5% upper critical
value of 0.5956; moreover, the A1 score, in this case, is
0.199, far below the 5% lower critical value of 0.3532. As
another instance for the PDM method in test set 1, a bad
observer working with a bad model and hard data generates an
A0 score of 0.478, which is below the 5% upper critical
value of 0.5956, and indeed less than the average (expected)
value of 0.5136. The A1 score, in this same case, is 0.389,
which exceeds the 5% lower critical value of 0.3532.

Table VIII for TS1 cannot be as readily interpreted
as the tables for A0 and A1. This is because the average
value of TS1 and its upper 5% critical value for the
stochaster are not readily derived for the equal-variance
categories. They may, e.g., be worked out by Monte Carlo
means for moderate sample sizes. For large sample sizes,
asymptotic analytic estimates are possible; however, these
will not be made here.
APPENDIX B



FIGURES

Approach to Forecasting.

Fig. B.3 Correlations of the Eight Predictors of the Easy Data Set.

Fig. B.4 Correlations of the Eight Predictors of the Hard Data Set.

Fig. B.6 Equal Sigma Interval (1700 rows in Category 2).



APPENDIX C 



TABLES 



TABLE I

COVARIANCE MATRICES USED IN SIMULATING THE DATA SETS

EASY

MATRIX M

1.000 0.861 0.741
0.861 1.000 0.861
0.741 0.861 1.000
0.638 0.741 0.861
0.549 0.638 0.741
0.472 0.549 0.638
0.407 0.472 0.549
0.350 0.407 0.472
0.301 0.350 0.407

MATRIX K

0.900 0.775 0.667
0.775 0.900 0.775
0.667 0.775 0.900
0.574 0.667 0.775
0.494 0.574 0.667
0.425 0.494 0.574
0.366 0.425 0.494
TABLE II



COVARIANCE MATRICES AFTER MODIFICATION 
BY MODULAR ARITHMETIC 



EASY DATA SET 



MATRIX M 



MATRIX K 



M( 0 ) = 


1.0000 


K 1 
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TABLE III 

S T MATRICES USED TO DISTORT THE ORIGINAL SIGNAL 
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TABLE IV 

OBSERVER ERROR TABLE (OET)

                         Actual category j

                          1      2      3

  Estimated        1     e11    e12    e13
  category i       2     e21    e22    e23
                   3     e31    e32    e33


TABLE V 

SKILLS OF THE THREE OBSERVER TYPES 



PERFECT OBSERVER 



1.00 0.00 0.00 

0.00 1.00 0.00 

0.00 0.00 1.00 



GOOD OBSERVER 



0.90 0.10 0.00 
0.10 0.80 0.10 
0.00 0.10 0.90 



BAD OBSERVER 



0.72 0.18 0.10 
0.18 0.64 0.18 
0.10 0.18 0.72 
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TABLE VII 

VERIFICATION SCORES FOR A1

TABLE VIII

VERIFICATION SCORES FOR TS1

TABLE IX

ANOVA TABLE FOR A0 - 4 METHODS COMPARED

SOURCE     SUMSQ    DF        MS         FOBS    FCRIT     PVALUE
A         0.1451     3    0.0484       230.47    2.682   0.000000
B         0.4838     1    0.4838      2305.96    3.857   0.000000
C         0.2145     1    0.2145      1022.55    3.857   0.000000
D         0.4955     2    0.2478      1180.87    3.043   0.000000
AB        0.0014     3    0.0005         2.30    2.682   0.077623
AC        0.0027     3    0.0009         4.24    2.682   0.004303
AD        0.0003     6    0.0000         0.21    2.212   0.965467
BC        0.0020     1    0.0020         9.38    3.857   0.000764
BD        0.0306     2    0.0153        72.91    3.043   0.000000
CD        0.0344     2    0.0172        82.00    3.043   0.000000
ABC       0.0037     3    0.0012         5.82    2.682   0.000384
ABD       0.0001     6    0.0000         0.10    2.212   1.019332
ACD       0.0003     6    0.0000         0.23    2.212   0.954907
BCD       0.0003     2    0.0002         0.76    3.043   0.495613
ABCD      0.0016     6    0.0003         1.27    2.212   0.287966
ERROR     0.0201    96
TOTAL     1.4364   143

CO 48.6228    MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 4 MOS METHODS
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES

TABLE X

ANOVA TABLE FOR A1 - 4 METHODS COMPARED

SOURCE     SUMSQ    DF        MS         FOBS    FCRIT     PVALUE
A         0.0422     3    0.0141        66.15    2.682   0.000000
B         0.1626     1    0.1626       765.34    3.857   0.000000
C         0.0800     1    0.0800       376.60    3.857   0.000000
D         0.0645     2    0.0323       151.86    3.043   0.000000
AB       -0.0002     3   -0.0001        -0.24    2.682   0.852105
AC        0.0005     3    0.0002         0.84    2.682   0.500000
AD        0.0016     6    0.0003         1.29    2.212   0.278279
BC        0.0135     1    0.0135        63.47    3.857   0.000000
BD        0.0216     2    0.0108        50.87    3.043   0.000000
CD        0.0195     2    0.0098        45.92    3.043   0.000000
ABC       0.0022     3    0.0007         3.40    2.682   0.015532
ABD       0.0025     6    0.0004         1.95    2.212   0.077145
ACD       0.0021     6    0.0004         1.66    2.212   0.139072
BCD       0.0064     2    0.0032        15.08    3.043   0.000000
ABCD      0.0011     6    0.0002         0.86    2.212   0.542469
ERROR     0.0204    96
TOTAL     0.4407   143

CO 17.4203    MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 4 MOS METHODS
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES

TABLE XI

ANOVA TABLE FOR TS1 - 4 METHODS COMPARED

SOURCE     SUMSQ    DF        MS         FOBS    FCRIT     PVALUE
A         0.1663     3    0.0554       138.33    2.682   0.000000
B         0.5971     1    0.5971      1490.11    3.857   0.000000
C         0.3342     1    0.3342       834.07    3.857   0.000000
D         0.7522     2    0.3761       938.60    3.043   0.000000
AB        0.0024     3    0.0008         2.03    2.682   0.113553
AC        0.0048     3    0.0016         3.99    2.682   0.006372
AD        0.0016     6    0.0003         0.66    2.212   0.691236
BC        0.0009     1    0.0009         2.32    3.857   0.127956
BD        0.0583     2    0.0291        72.71    3.043   0.000000
CD        0.0627     2    0.0313        78.24    3.043   0.000000
ABC       0.0056     3    0.0019         4.70    2.682   0.002144
ABD      -0.0001     6    0.0000        -0.05    2.212   1.064746
ACD       0.0004     6    0.0001         0.15    2.212   0.989171
BCD       0.0005     2    0.0003         0.63    3.043   0.559566
ABCD      0.0023     6    0.0004         0.95    2.212   0.478893
ERROR     0.0385    96
TOTAL     2.0277   143

CO 37.9683    MSE 0.0004

ANOVA TABLE WITH 4 FACTORS:
A = 4 MOS METHODS
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES

TABLE XII

ANOVA TABLE FOR A0 - 3 METHODS COMPARED

SOURCE     SUMSQ    DF        MS         FOBS    FCRIT     PVALUE
A         0.0007     2    0.0003         1.50    3.078   0.244744
B         0.3424     1    0.3424      1560.90    3.890   0.000000
C         0.1423     1    0.1423       648.83    3.890   0.000000
D         0.3817     2    0.1908       870.02    3.078   0.000000
AB        0.0002     2    0.0001         0.45    3.078   0.651507
AC        0.0003     2    0.0002         0.70    3.078   0.527535
AD       -0.0001     4    0.0000  [illegible]    2.503   0.965831
BC        0.0002     1    0.0002         0.77    3.890   0.408447
BD        0.0245     2    0.0122        55.79    3.078   0.000000
CD        0.0276     2    0.0138        62.85    3.078   0.000000
ABC      -0.0004     2   -0.0002        -0.90    3.078   0.434343
ABD      -0.0002     4   -0.0001  [illegible]    2.503   0.879029
ACD       0.0003     4    0.0001         0.37    2.503   0.824636
BCD       0.0005     2    0.0002         1.08    3.078   0.369748
ABCD      0.0008     4    0.0002         0.96    2.503   0.456994
ERROR     0.0158    72
TOTAL     0.9364   107

CO 38.7965    MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 3 MOS METHODS
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES






TABLE XIII

ANOVA TABLE FOR A1 - 3 METHODS COMPARED

SOURCE     SUMSQ    DF        MS         FOBS    FCRIT     PVALUE
A         0.0011     2    0.0006         2.61    3.078   0.075201
B         0.1244     1    0.1244       580.57    3.890   0.000000
C         0.0607     1    0.0607       283.38    3.890   0.000000
D         0.0525     2    0.0262       122.51    3.078   0.000000
AB       -0.0002     2   -0.0001        -0.45    3.078   0.652335
AC        0.0005     2    0.0003         1.21    3.078   0.325188
AD        0.0012     4    0.0003         1.41    2.503   0.248677
BC        0.0059     1    0.0059        27.74    3.890   0.000000
BD        0.0214     2    0.0107        50.04    3.078   0.000000
CD        0.0175     2    0.0088        40.87    3.078   0.000000
ABC       0.0004     2    0.0002         0.88    3.078   0.446352
ABD       0.0004     4    0.0001         0.45    2.503   0.770252
ACD       0.0016     4    0.0004         1.87    2.503   0.124488
BCD       0.0055     2    0.0027        12.74    3.078   0.000002
ABCD      0.0003     4    0.0001         0.40    2.503   0.803020
ERROR     0.0154    72
TOTAL     0.3088   107

CO 12.3448    MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 3 MOS METHODS
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES

TABLE XIV

ANOVA TABLE FOR TS1 - 3 METHODS COMPARED 



SOURCE 


SUMSQ 


DF 


MS 


FOBS 


FCRIT 


PVALUE 


A 


0.0007 


2 


0.0004 


0.80 


3.078 


0.480915 


B 


0.4164 


1 


0.4164 


946.79 


3.890 


0.000000 


C 


0.2175 


1 


0.2175 


494.53 


3.890 


0.000000 


D 


0.5954 


2 


0.2977 


676.99 


3.078 


0.000000 


AB 


0.0001 


2 


0.0001 


0.14 


3.078 


0.838436 


AC 


0.0001 


2 


0.0000 


0.09 


3.078 


0.873986 


AD 


-0.0002 


4 


0.0000 


-0.09 


2.503 


0.974577 


BC 


0.0004 


1 


0.0004 


0.80 


3.890 


0.398383 


BD 


0.0446 


2 


0.0223 


50.66 


3.078 


0.000000 


CD 


0.0463 


2 


0.0231 


52.64 


3 . 078 


0.000000 


ABC 


-0.0003 


2 


-0.0001 


-0.33 


3.078 


0.722544 


ABD 


-0.0001 


4 


0.0000 


-0.04 


2.503 


0.999907 


ACD 


0.0006 


4 


0.0002 


0.36 


2.503 


0.830389 


BCD 


0.0000 


2 


0.0000 


0.02 


3.078 


0.931326 


ABCD 


0.0007 


4 


0.0002 


0.41 


2.503 


0.798596 


ERROR 


0.0317 


72 










TOTAL 


1.3538 


107 










CO 30. 


,6874 MSE 


0.0004 











ANOVA TABLE WITH 4 FACTORS: 
A = 3 MOS METHODS 
B = 2 DATA SETS 
C = 2 MODEL VERSIONS 
D = 3 OBSERVER TYPES 
TABLE XV

ANOVA TABLE FOR A0 - PDM VS DISC

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0002    1    0.0002      0.95   3.958   0.356832
B         0.2333    1    0.2333   1036.68   3.958   0.000000
C         0.0901    1    0.0901    400.14   3.958   0.000000
D         0.2572    2    0.1286    571.42   3.149   0.000000
AB        0.0000    1    0.0000      0.07   3.958   0.736694
AC        0.0000    1    0.0000      0.07   3.958   0.736694
AD       -0.0002    2   -0.0001     -0.41   3.149   0.679774
BC        0.0000    1    0.0000      0.14   3.958   0.684829
BD        0.0162    2    0.0081     35.90   3.149   0.000000
CD        0.0216    2    0.0108     48.03   3.149   0.000000
ABC      -0.0002    1   -0.0002     -0.68   3.958   0.436663
ABD       0.0001    2    0.0000      0.17   3.149   0.818564
ACD       0.0002    2    0.0001      0.44   3.149   0.658203
BCD       0.0005    2    0.0002      1.02   3.149   0.394105
ABCD      0.0002    2    0.0001      0.54   3.149   0.604052
ERROR     0.0108   48
TOTAL     0.6301   71

CO 25.7722   MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (PDM VS DISC)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES
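
The FOBS column can be checked against the other columns using the standard fixed-effects relation, in which the observed F is the effect mean square divided by the error mean square (ERROR sum of squares over ERROR degrees of freedom). The sketch below applies that relation to the B and D rows of Table XV, taking 0.0108 as the ERROR sum of squares. The helper name `f_observed` is hypothetical, and because the printed values are rounded to four decimals the result only approximates the tabulated FOBS.

```python
def f_observed(ms_effect, ss_error, df_error):
    """Observed F ratio: effect mean square over error mean square."""
    mse = ss_error / df_error
    return ms_effect / mse


# Values as printed in Table XV (rounded to four decimals in the source).
f_b = f_observed(0.2333, 0.0108, 48)  # factor B, tabulated FOBS = 1036.68
f_d = f_observed(0.1286, 0.0108, 48)  # factor D, tabulated FOBS = 571.42

print(round(f_b, 2), round(f_d, 2))
```

Both recomputed ratios land within about one unit of the tabulated values, which suggests the small discrepancies come from rounding in the printed SUMSQ and MS columns rather than from a different definition of FOBS.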
TABLE XVI

ANOVA TABLE FOR A1 - PDM VS DISC

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0008    1    0.0008      3.80   3.958   0.046882
B         0.0825    1    0.0825    377.69   3.958   0.000000
C         0.0345    1    0.0345    157.92   3.958   0.000000
D         0.0349    2    0.0175     79.94   3.149   0.000000
AB       -0.0001    1   -0.0001     -0.35   3.958   0.569152
AC       -0.0001    1   -0.0001     -0.37   3.958   0.556986
AD        0.0014    2    0.0007      3.16   3.149   0.044791
BC        0.0046    1    0.0046     21.25   3.958   0.000003
BD        0.0154    2    0.0077     35.23   3.149   0.000000
CD        0.0158    2    0.0079     36.18   3.149   0.000000
ABC       0.0002    1    0.0002      1.13   3.958   0.312872
ABD       0.0002    2    0.0001      0.37   3.149   0.699473
ACD       0.0003    2    0.0002      0.69   3.149   0.530105
BCD       0.0035    2    0.0018      8.02   3.149   0.000363
ABCD      0.0003    2    0.0002      0.74   3.149   0.507911
ERROR     0.0105   48
TOTAL     0.2047   71

CO 8.2640   MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (PDM VS DISC)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES
TABLE XVII

ANOVA TABLE FOR TS1 - PDM VS DISC

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0005    1    0.0005      1.01   3.958   0.340231
B         0.2840    1    0.2840    628.63   3.958   0.000000
C         0.1414    1    0.1414    312.96   3.958   0.000000
D         0.3942    2    0.1971    436.34   3.149   0.000000
AB        0.0000    1    0.0000     -0.10   3.958   0.709762
AC       -0.0001    1   -0.0001     -0.14   3.958   0.685636
AD       -0.0001    2   -0.0001     -0.14   3.149   0.842322
BC        0.0000    1    0.0000      0.03   3.958   0.770692
BD        0.0290    2    0.0145     32.11   3.149   0.000000
CD        0.0358    2    0.0179     39.67   3.149   0.000000
ABC       0.0000    1    0.0000      0.03   3.958   0.770692
ABD       0.0001    2    0.0001      0.15   3.149   0.829864
ACD       0.0003    2    0.0001      0.29   3.149   0.746003
BCD       0.0003    2    0.0001      0.29   3.149   0.746003
ABCD      0.0002    2    0.0001      0.19   3.149   0.808177
ERROR     0.0217   48
TOTAL     0.9072   71

CO 20.4221   MSE 0.0005

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (PDM VS DISC)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES









TABLE XVIII

ANOVA TABLE FOR A0 - PDM VS DDRR

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0004    1    0.0004      1.90   3.958   0.178464
B         0.2237    1    0.2237   1031.79   3.958   0.000000
C         0.0950    1    0.0950    438.05   3.958   0.000000
D         0.2521    2    0.1260    581.31   3.149   0.000000
AB        0.0001    1    0.0001      0.49   3.958   0.504978
AC        0.0003    1    0.0003      1.20   3.958   0.297287
AD       -0.0002    2   -0.0001     -0.35   3.149   0.710845
BC        0.0000    1    0.0000      0.14   3.958   0.681342
BD        0.0169    2    0.0085     38.99   3.149   0.000000
CD        0.0182    2    0.0091     41.91   3.149   0.000000
ABC      -0.0002    1   -0.0002     -0.84   3.958   0.383821
ABD      -0.0001    2    0.0000     -0.18   3.149   0.816188
ACD       0.0006    2    0.0003      1.48   3.149   0.252559
BCD       0.0008    2    0.0004      1.76   3.149   0.190655
ABCD      0.0003    2    0.0001      0.63   3.149   0.558304
ERROR     0.0104   48
TOTAL     0.6183   71

CO 25.8884   MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (PDM VS DDRR)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES
TABLE XIX

ANOVA TABLE FOR A1 - PDM VS DDRR

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0006    1    0.0006      2.92   3.958   0.086440
B         0.0814    1    0.0814    392.98   3.958   0.000000
C         0.0424    1    0.0424    204.75   3.958   0.000000
D         0.0404    2    0.0202     97.63   3.149   0.000000
AB       -0.0001    1   -0.0001     -0.45   3.958   0.523927
AC        0.0006    1    0.0006      2.74   3.958   0.097799
AD        0.0002    2    0.0001      0.43   3.149   0.662482
BC        0.0030    1    0.0030     14.35   3.958   0.000073
BD        0.0146    2    0.0073     35.35   3.149   0.000000
CD        0.0113    2    0.0057     27.35   3.149   0.000000
ABC       0.0001    1    0.0001      0.57   3.958   0.476704
ABD       0.0003    2    0.0001      0.61   3.149   0.567456
ACD       0.0014    2    0.0007      3.46   3.149   0.032740
BCD       0.0049    2    0.0024     11.82   3.149   0.000015
ABCD     -0.0001    2    0.0000     -0.17   3.149   0.816789
ERROR     0.0099   48
TOTAL     0.2110   71

CO 8.2898   MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (PDM VS DDRR)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES
TABLE XX

ANOVA TABLE FOR TS1 - PDM VS DDRR

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0004    1    0.0004      0.90   3.958   0.370369
B         0.2717    1    0.2717    615.67   3.958   0.000000
C         0.1454    1    0.1454    329.46   3.958   0.000000
D         0.4034    2    0.2017    457.12   3.149   0.000000
AB        0.0000    1    0.0000     -0.03   3.958   0.769925
AC        0.0000    1    0.0000      0.03   3.958   0.769763
AD       -0.0002    2   -0.0001     -0.24   3.149   0.775460
BC        0.0002    1    0.0002      0.45   3.958   0.522512
BD        0.0313    2    0.0157     35.48   3.149   0.000000
CD        0.0307    2    0.0154     34.81   3.149   0.000000
ABC      -0.0001    1   -0.0001     -0.14   3.958   0.683467
ABD       0.0001    2    0.0000      0.10   3.149   0.862142
ACD       0.0009    2    0.0004      1.00   3.149   0.399339
BCD       0.0005    2    0.0003      0.57   3.149   0.589589
ABCD      0.0001    2    0.0001      0.14   3.149   0.838830
ERROR     0.0212   48
TOTAL     0.9055   71

CO 20.4018   MSE 0.0004

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (PDM VS DDRR)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES



TABLE XXI

ANOVA TABLE FOR A0 - DISC VS DDRR

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0003    1    0.0003      1.39   3.958   0.258223
B         0.2279    1    0.2279   1037.38   3.958   0.000000
C         0.0999    1    0.0999    454.85   3.958   0.000000
D         0.2543    2    0.1272    578.92   3.149   0.000000
AB        0.0002    1    0.0002      0.69   3.958   0.432009
AC        0.0001    1    0.0001      0.28   3.958   0.601672
AD       -0.0001    2    0.0000     -0.21   3.149   0.796003
BC        0.0000    1    0.0000     -0.07   3.958   0.735525
BD        0.0154    2    0.0077     34.98   3.149   0.000000
CD        0.0150    2    0.0075     34.14   3.149   0.000000
ABC      -0.0002    1   -0.0002     -0.97   3.958   0.347797
ABD       0.0000    2    0.0000      0.03   3.149   0.914440
ACD       0.0002    2    0.0001      0.38   3.149   0.690864
BCD       0.0006    2    0.0003      1.46   3.149   0.257405
ABCD      0.0003    2    0.0002      0.76   3.149   0.497225
ERROR     0.0105   48
TOTAL     0.6244   71

CO 25.9328   MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (DISC VS DDRR)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES
TABLE XXII

ANOVA TABLE FOR A1 - DISC VS DDRR

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0001    1    0.0001      0.55   3.958   0.483529
B         0.0848    1    0.0848    390.08   3.958   0.000000
C         0.0448    1    0.0448    205.99   3.958   0.000000
D         0.0301    2    0.0151     69.29   3.149   0.000000
AB       -0.0001    1   -0.0001     -0.44   3.958   0.527387
AC        0.0003    1    0.0003      1.43   3.958   0.249962
AD        0.0004    2    0.0002      0.88   3.149   0.446950
BC        0.0044    1    0.0044     20.15   3.958   0.000004
BD        0.0131    2    0.0065     30.07   3.149   0.000000
CD        0.0087    2    0.0044     20.06   3.149   0.000000
ABC       0.0003    1    0.0003      1.30   3.958   0.275119
ABD       0.0001    2    0.0001      0.33   3.149   0.721399
ACD       0.0006    2    0.0003      1.49   3.149   0.248804
BCD       0.0028    2    0.0014      6.37   3.149   0.001721
ABCD      0.0002    2    0.0001      0.42   3.149   0.669063
ERROR     0.0104   48
TOTAL     0.2011   71

CO 8.1365   MSE 0.0002

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (DISC VS DDRR)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES
TABLE XXIII

ANOVA TABLE FOR TS1 - DISC VS DDRR

SOURCE     SUMSQ   DF        MS      FOBS   FCRIT     PVALUE
A         0.0002    1    0.0002      0.46   3.958   0.518262
B         0.2773    1    0.2773    642.89   3.958   0.000000
C         0.1484    1    0.1484    344.06   3.958   0.000000
D         0.3934    2    0.1967    455.98   3.149   0.000000
AB        0.0001    1    0.0001      0.18   3.958   0.658211
AC        0.0000    1    0.0000     -0.07   3.958   0.734412
AD       -0.0002    2   -0.0001     -0.18   3.149   0.815617
BC        0.0000    1    0.0000      0.07   3.958   0.734111
BD        0.0284    2    0.0142     32.86   3.149   0.000000
CD        0.0259    2    0.0130     30.05   3.149   0.000000
ABC       0.0000    1    0.0000     -0.11   3.958   0.706183
ABD       0.0001    2    0.0001      0.14   3.149   0.836747
ACD       0.0003    2    0.0001      0.32   3.149   0.727554
BCD       0.0002    2    0.0001      0.23   3.149   0.780631
ABCD      0.0002    2    0.0001      0.18   3.149   0.813851
ERROR     0.0207   48
TOTAL     0.8950   71

CO 20.5512   MSE 0.0004

ANOVA TABLE WITH 4 FACTORS:
A = 2 METHODS (DISC VS DDRR)
B = 2 DATA SETS
C = 2 MODEL VERSIONS
D = 3 OBSERVER TYPES